ABSTRACT


Data  association  and  track  association  algorithms  have  been  developed  over  the  course  of  thirty 
years.  Almost  all  technical  papers  that  describe  these  association  algorithms  have  begun  the  derivations 
by  adopting  log-likelihood  ratios  as  the  measure  of  association.  At  best,  this  starting  point  has  obscured 
the  assumptions  necessary  to  use  log-likelihood  ratios.  At  worst,  the  log-likelihood  ratios  have  been 
improperly  defined.  This  report  provides  the  first  known  derivation  of  a  track  association  algorithm  from 
the  first  principles  of  Bayesian  probability  theory.  By  starting  with  first  principles,  all  the  assumptions 
that  are  necessary  to  derive  an  association  algorithm  are  explicitly  stated  as  the  derivation  proceeds.  The 
correct  form  for  the  log-likelihood  ratios  is  obtained  later  in  the  derivation  and  can  be  traced  back  to  first 
principles.  The  pitfalls  and  deficiencies  of  poorly  performing  association  algorithms  are  identified  easily 
by  comparing  the  algorithms  with  the  full  derivation.  These  deficiencies  arise  from  such  mistakes  as  the 
incorrect  definition  of  the  log-likelihood  ratio,  poor  selection  of  the  probability  density  functions, 
incorrect  construction  of  the  cost  matrices,  and  the  application  of  an  algorithm  to  a  system  that  violates 
the  assumptions  that  were  adopted  during  the  algorithm  construction.  In  addition,  the  firm  grounding  in 
Bayesian  probability  theory  provides  the  means  to  easily  extend  the  derivation  to  produce  more  complex 
association  algorithms,  such  as  feature-aided  track  association  algorithms.  The  basic  derivation  provided 
in this report makes it clear that the ensemble of track association algorithms is much more extensive than
most  data  fusion  researchers  would  believe.  These  algorithms  can  be  created  by  simply  changing  any  of 
the  derivation  assumptions  or  the  probability  density  functions. 
1. INTRODUCTION


During  the  critical  review  of  a  track  association  algorithm,  a  number  of  severe  flaws  were 
discovered  in  the  algorithm  design  that  would  prevent  the  algorithm  from  delivering  robust  performance. 
In the course of working on track association for a fielded, real-time data fusion system, many pitfalls
were  encountered.  At  the  time  of  the  review,  it  was  believed  that  the  state  of  the  art  in  the  construction  of 
track  association  algorithms  had  advanced  to  the  point  that  the  deficiencies  in  the  reviewed  algorithm 
would  have  been  avoided  during  the  design  phase.  While  reviewing  the  literature  on  track  association 
algorithms,  it  was  found  that  the  state  of  the  art  was  not  as  advanced  as  had  been  supposed.  Different 
approaches  by  different  authors  [1],  [2],  [3],  [4],  [5],  [6],  [7]  were  found  that  could  not  be  considered  a 
unified  approach  to  track  association,  in  spite  of  a  40-year  history  of  development  in  the  field  [8],  [9].  To 
resolve  the  disagreements  between  the  algorithm  reported  here  and  those  of  various  authors,  the  track 
association  algorithm  was  rederived  by  starting  with  a  basic  probability  equation.  The  inspiration  for  this 
approach  was  a  graduate  thesis  [10].  This  derivation  shows  that  many  of  the  track  association  algorithms 
are  only  partially  correct  (including  that  reported  here)  and  suffer  from  deficiencies.  In  many  cases, 
fortuitous  selection  of  additional  assumptions,  such  as  the  form  of  the  prior  probabilities,  led  to  some 
algorithms  performing  correctly  for  the  intended  application.  The  derivation  also  gives  insight  into  why 
some  of  the  algorithms  did  not  perform  as  well  as  expected  and  led  to  some  authors  choosing  to  adopt 
tuning  parameters  to  get  better  performance.  This  new  derivation  lays  out  all  the  assumptions  that  are 
required  to  attain  a  straightforward  track  association  algorithm  and  also  provides  a  template  for  others 
who  may  wish  to  adopt  assumptions  that  are  different  from  those  adopted  here. 

Most  track  association  algorithms  can  be  decomposed  into  a  two-step  algorithm.  The  first  step  is  to 
construct  a  matrix  of  association  costs  between  two  sets  of  tracks.  The  second  step  usually  consists  of 
running  a  linear  assignment  algorithm  on  the  matrix  to  determine  the  optimal  track  associations,  in  terms 
of the overall cost. The second step is well understood, owing to the discovery of optimal linear assignment algorithms such as the Munkres [11], Jonker-Volgenant [12], and Jonker-Volgenant-Castañón [1], [3] algorithms. These algorithms have, for the most part, supplanted less efficient and suboptimal algorithms such as the nearest-neighbor and greedy algorithms. The construction of the cost matrix is less well understood and is the step on which this report focuses.

This  report  presents  the  detailed  derivation  of  a  track  association  algorithm  and  clearly  states  all 
assumptions  as  they  are  made.  Most  previous  derivations  have  been  started  with  the  use  of  likelihood 
ratios,  obscuring  most  of  the  assumptions  that  are  necessary  for  the  construction  of  an  assignment 
algorithm.  The  approach  taken  here  is  to  start  with  a  simple  probability  function  that  is  dependent  on  the 
hypothetical  tracks,  the  data,  and  prior  information.  The  axioms  and  theorems  of  Bayesian  probability 
theory  [13]  are  then  used  to  expand  the  probability  function  into  a  product  of  simplified  probability 
functions.  The  full  derivation  clearly  outlines  how  to  develop  variations  and  extensions  to  the  standard 
metric  track  association  algorithm.  One  variation  of  current  interest  to  researchers  is  the  extension  of  the 
track  association  algorithm  from  metrics-only  to  feature-aided.  The  simplest  of  these  extensions  are 
presented  in  this  report.  More  advanced  feature-aided  track  association  algorithms  require  specific 
knowledge  about  the  types  of  sensors  and  the  types  of  targets  that  are  involved  in  the  association. 
2. INITIAL DERIVATION


An  algorithm  design  that  is  based  upon  a  firm  theoretical  foundation  avoids  many  of  the  problems 
that  are  almost  always  encountered  if  the  design  is  based  upon  ad  hoc  principles.  This  statement  is 
especially  true  for  track  association  algorithms.  The  theoretical  foundation  provides  information  to  the 
algorithm  designer  on  how  to  construct  track  association  algorithms,  allows  the  designer  to  know  when  he 
is  departing  from  theory,  and  allows  the  designer  to  understand  why  a  track  association  algorithm  might 
be  failing. 

Probability  theory  is  the  theoretical  foundation  for  the  track  association  algorithm  derived  here.  The 
starting  point  could  be  called  the  probability  of  everything,  given  as 

$$P(H, D, I) , \tag{1}$$

where $H$ is a hypothesis, $D$ are the data, and $I$ is the prior information. This joint probability can be expanded into two probabilities,

$$P(H, D, I) = P(H, D \mid I)\, P(I) , \tag{2}$$

where  one  is  a  conditional  probability,  dependent  upon  the  parameter  of  the  unconditional  probability. 
This  expansion  is  based  on  a  standard  axiom  of  probability  theory. 

One  can  work  toward  Bayes’  rule  by  generating  the  equality, 

$$P(H \mid D, I)\, P(D \mid I)\, P(I) = P(D \mid H, I)\, P(H \mid I)\, P(I) , \tag{3}$$

with additional expansion. This relationship equates two different expansions of the conditional dependencies for the data and the hypothesis.

The identical $P(I)$ terms can be canceled from both sides of the equality, and both sides can be divided by $P(D \mid I)$ to arrive at a formulation of Bayes' rule, where the probability of the prior information is not considered to be important:

$$P(H \mid D, I) = \frac{P(D \mid H, I)\, P(H \mid I)}{P(D \mid I)} . \tag{4}$$

Equation  (4)  gives  a  function  for  the  probability  of  the  hypothesis,  conditioned  on  the  data  and  the  prior 
information.  The  goal  is  to  find  the  hypothesis  that  maximizes  this  probability,  conditioned  on  the 
measured  data  and  the  prior  information.  The  denominator  on  the  right  is  usually  calculated  with  the 
equation 
$$P(D \mid I) = \sum_{H} P(D \mid H, I)\, P(H \mid I) , \tag{5}$$

although  the  term  can  be  ignored  for  most  hypothesis  ensembles  because  it  is  often  a  normalization  term. 
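Bayes' rule and the normalization sum of Equation (5) can be illustrated with a minimal sketch, assuming a small, finite hypothesis ensemble. The names `posterior`, `prior`, and `likelihood`, and all of the numbers, are hypothetical and chosen only for illustration. The sketch shows that dropping the denominator does not change which hypothesis is the most probable.

```python
def posterior(prior, likelihood):
    """Return P(H | D, I) for every hypothesis in a finite ensemble.

    prior[h]      stands in for P(H | I)
    likelihood[h] stands in for P(D | H, I)
    """
    # Numerator of Bayes' rule for every hypothesis.
    joint = {h: likelihood[h] * prior[h] for h in prior}
    # Denominator P(D | I): the sum over the hypothesis ensemble.
    evidence = sum(joint.values())
    return {h: joint[h] / evidence for h in joint}


# Hypothetical three-hypothesis ensemble with made-up numbers.
prior = {"h1": 0.5, "h2": 0.3, "h3": 0.2}
likelihood = {"h1": 0.10, "h2": 0.40, "h3": 0.05}

post = posterior(prior, likelihood)

# The most probable hypothesis is the same with or without normalization.
best_normalized = max(post, key=post.get)
best_unnormalized = max(prior, key=lambda h: likelihood[h] * prior[h])
print(best_normalized, best_unnormalized)
```

Because the denominator is common to every hypothesis, only the products in the numerator need to be compared when searching for the maximum.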

The  specific  problem  of  interest  is  to  determine  the  best  association  between  the  sets  of  tracks  from 
two  different  sensors.  Association  is  not  considered  between  more  than  two  sensors  for  this  derivation. 
The track sets are defined as $D_1$ and $D_2$. The basic equation expands to

$$P(H \mid D_1, D_2, I) = \frac{P(D_1, D_2 \mid H, I)\, P(H \mid I)}{P(D_1, D_2 \mid I)} . \tag{6}$$


A  complication  with  track  association  is  that  one  sensor  may  track  an  object  that  the  other  sensor  does  not 
detect.  Missing  data  need  to  be  represented  in  our  probability  equation,  just  like  measured  data  are 
represented. We will define missing data as $\bar{D}_s$, where the subscript $s$ denotes either sensor one or sensor two. The parameters expand further to


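$$P(H \mid D_1, \bar{D}_1, D_2, \bar{D}_2, I) = \frac{P(D_1, \bar{D}_1, D_2, \bar{D}_2 \mid H, I)\, P(H \mid I)}{P(D_1, \bar{D}_1, D_2, \bar{D}_2 \mid I)} . \tag{7}$$

The equation can continue to be expanded to additional conditional dependencies,

$$P(H \mid D, I) = \frac{P(\bar{D}_1, \bar{D}_2 \mid D_1, D_2, H, I)\, P(D_1, D_2 \mid H, I)\, P(H \mid I)}{P(D_1, \bar{D}_1, D_2, \bar{D}_2 \mid I)} , \tag{8}$$

and to

$$P(H \mid D, I) = \frac{P(\bar{D}_1, \bar{D}_2 \mid D_1, D_2, H, I)\, P(D_1 \mid D_2, H, I)\, P(D_2 \mid H, I)\, P(H \mid I)}{P(D_1, \bar{D}_1, D_2, \bar{D}_2 \mid I)} . \tag{9}$$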

The  term  on  the  left  has  been  simplified  to  keep  the  equation  to  one  line. 

The  first  assumption  made  is  that 

(Al)  the  tracks  detected  by  one  sensor  are  statistically  independent  of  the  tracks  detected  by  the 
other  sensor. 


The  equation  simplifies  slightly  to 

$$P(H \mid D, I) = \frac{P(\bar{D}_1, \bar{D}_2 \mid D_1, D_2, H, I)\, P(D_1 \mid H, I)\, P(D_2 \mid H, I)\, P(H \mid I)}{P(D_1, \bar{D}_1, D_2, \bar{D}_2 \mid I)} . \tag{10}$$
The missing data terms are expanded to additional conditional dependencies:

$$P(H \mid D, I) = \frac{P(\bar{D}_1 \mid D_1, D_2, \bar{D}_2, H, I)\, P(\bar{D}_2 \mid D_1, D_2, H, I)\, P(D_1 \mid H, I)\, P(D_2 \mid H, I)\, P(H \mid I)}{P(D_1, \bar{D}_1, D_2, \bar{D}_2 \mid I)} . \tag{11}$$

The  second  assumption  made  is  that 

(A2)  the  tracks  missed  by  one  sensor  are  statistically  independent  of  the  tracks  missed  or  detected 
by  the  other  sensor, 

which  leads  to 


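$$P(H \mid D, I) = \frac{P(\bar{D}_1 \mid D_1, H, I)\, P(\bar{D}_2 \mid D_2, H, I)\, P(D_1 \mid H, I)\, P(D_2 \mid H, I)\, P(H \mid I)}{P(D_1, \bar{D}_1, D_2, \bar{D}_2 \mid I)} . \tag{12}$$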

A  reason  to  keep  the  missing  data  conditionally  dependent  on  the  measured  data  of  the  same  sensor 
is  that  it  can  account  for  the  case  when  there  are  multiple,  closely  spaced,  hypothetical  objects,  and  the 
sensor  measures  the  multiple,  closely  spaced  objects  as  one  detected  object  because  of  the  sensor 
resolution.  An  approximation  for  target  association  might  consider  this  case  to  be  equivalent  to  the  case 
when  only  one  hypothetical  object  was  detected  and  the  other  hypothetical  objects  were  not.  A  different 
approach  to  handling  hypothetical,  closely  spaced  objects  is  to  allow  for  the  possibility  that  a  detected 
track may represent multiple hypothetical objects. The $P(D \mid H, I)$ terms will have to be constructed to
account  for  the  possibility  that  a  single  detected  object  might  be  multiple,  unresolved,  hypothetical 
objects.  Instead  of  worrying  about  the  additional  complexity  that  unresolved  hypothetical  objects  add  to 
the  derivation  of  a  track  association  algorithm,  it  is  assumed  that 

(A3)  missing  data  are  not  conditionally  dependent  on  the  measured  data  for  a  sensor. 

The  equation  simplifies  to 


$$P(H \mid D, I) = \frac{P(\bar{D}_1 \mid H, I)\, P(\bar{D}_2 \mid H, I)\, P(D_1 \mid H, I)\, P(D_2 \mid H, I)\, P(H \mid I)}{P(D_1, \bar{D}_1, D_2, \bar{D}_2 \mid I)} . \tag{13}$$

Though this is still far from a track association algorithm, three assumptions have already been made about the nature of the sensor data and the hypothetical objects. More assumptions will be made before an association algorithm is finally produced.

Because  the  problem  considered  here  involves  associating  multiple  targets  from  two  sensors,  details 
are  added  to  the  data  and  the  possible  hypotheses.  The  sensor  data  and  the  hypothesis  can  be  considered 
to  consist  of  a  set  of  objects: 
$$D_1 = \{ d_{11}(x), d_{12}(x), d_{13}(x), \ldots, d_{1I}(x) \} , \tag{14}$$

$$D_2 = \{ d_{21}(x), d_{22}(x), d_{23}(x), \ldots, d_{2J}(x) \} , \tag{15}$$

$$H = \{ h_1(x), h_2(x), h_3(x), \ldots, h_K(x), n(x), a(\{i, \bar{D}_1\}, \{j, \bar{D}_2\}, \{k, n\}) \} , \tag{16}$$

with $I$ objects detected by sensor one, $J$ objects detected by sensor two, and $K$ hypothetical objects. A hypothetical noise variable $n(x)$ is defined momentarily, to allow for the possibility that some detected objects may really be due to noise. The hypothesis also includes a function $a(\{i, \bar{D}_1\}, \{j, \bar{D}_2\}, \{k, n\})$ that represents the association between the detected and missed objects in the two sensor sets, the hypothetical objects, and the noise. The variable $x$ represents the global feature space in which the sensor measurements are made. Not all data and noise functions necessarily depend upon all the dimensions of the global feature space.

To  progress  further  toward  a  track  association  algorithm,  additional  statistical  independence  (SI) 
and  conditional  dependence  (CD)  assumptions  are  made: 

(A4) the $d_{1i}(x)$ data are SI of the $d_{2j}(x)$ data,

(A5) the $d_{1i}(x)$ objects are SI of each other,

(A6) the $d_{2j}(x)$ objects are SI of each other,

(A7) the $h_k(x)$ objects are SI of each other,

(A8) any one $d_{si}(x)$ for any given sensor is CD on only one $h_k(x)$,

(A9) no $d_{si}(x)$ are associated with noise $n(x)$, and

(A10) all hypothetical objects $h_k(x)$ are detected by at least one sensor.

Assumption (A9) means that the hypothetical noise element can be neglected. Assumption (A10)
prohibits  hypothesizing  the  existence  of  objects  that  are  not  detected  by  any  of  the  sensors.  There  are 
cases  where  hypothesizing  the  existence  of  undetected  objects  might  be  sensible  because  the  prior 
information  might  support  the  existence  of  these  objects.  This  current  derivation  does  not  consider  this 
additional  complication. 

Assumptions (A9) and (A10) impose a constraint on the number of hypothetical objects,

$$\max(I, J) \le K \le I + J . \tag{17}$$

Every  detected  object  must  match  a  hypothetical  object.  A  hypothetical  object  is  associated  with  either  a 
detected  object  from  only  one  given  sensor  or  detected  objects  from  both  sensors.  The  current  task  is  to 
determine the most probable association between the hypothetical objects and the detected objects from
the  two  sensors,  which  indirectly  provides  for  the  association  between  the  two  sets  of  detected  objects. 

The  denominator  is  neglected  in  the  reported  probability  function;  proportionality  between 
hypotheses  is  relied  upon  to  search  for  the  most  probable  hypothesis, 

$$P(H \mid D, I) \propto P(\bar{D}_1 \mid H, I)\, P(\bar{D}_2 \mid H, I)\, P(D_1 \mid H, I)\, P(D_2 \mid H, I)\, P(H \mid I) . \tag{18}$$

This  simplification  is  reasonable  because  the  denominator  is  independent  of  the  hypotheses  under 
consideration  for  the  association  problem. 

The  next  step  is  to  expand  the  conditional  probabilities  for  the  individual  objects  and  impose 
assumptions  (A4)  through  (A10).  The  resulting  equation  contains  probabilities  for  associations  that  can  be 
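grouped into three classes: hypothetical objects detected by both sensors ($q: 1, 2$), hypothetical objects detected by sensor one but not by sensor two ($r: 1, \bar{2}$), and hypothetical objects detected by sensor two but not by sensor one ($s: \bar{1}, 2$). Each hypothetical track $k$ is assigned to an element, $q$, $r$, or $s$. Each sensor track $d_{1i}(x)$ or $d_{2j}(x)$ is assigned to an element, $q$, $r$, or $s$, as well:

$$\begin{aligned}
P(H \mid D, I) \propto{} & P(H \mid I) \prod_{s: \bar{1}, 2} P(\bar{d}_1 \mid h_s, I)\, P(d_{2j(s)} \mid h_s, I) \\
& \times \prod_{r: 1, \bar{2}} P(d_{1i(r)} \mid h_r, I)\, P(\bar{d}_2 \mid h_r, I) \\
& \times \prod_{q: 1, 2} P(d_{1i(q)} \mid h_q, I)\, P(d_{2j(q)} \mid h_q, I) .
\end{aligned} \tag{19}$$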

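Next, the hypothetical objects are assumed to have definite values in the feature space $x$. The definite values of the hypothetical objects in the feature space are represented by $\mu_q$, $\mu_r$, and $\mu_s$ for $K$ hypothetical objects.

$$\begin{aligned}
P(H \mid D, I) \propto{} & P(H \mid I) \prod_{s} P(\bar{d}_1 \mid h_s(\mu_s), I)\, P(d_{2j(s)} \mid h_s(\mu_s), I) \\
& \times \prod_{r} P(d_{1i(r)} \mid h_r(\mu_r), I)\, P(\bar{d}_2 \mid h_r(\mu_r), I) \\
& \times \prod_{q} P(d_{1i(q)} \mid h_q(\mu_q), I)\, P(d_{2j(q)} \mid h_q(\mu_q), I) .
\end{aligned} \tag{20}$$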

The hypothesis information consists of the associations between the hypothetical objects and the sensor objects, and the hypothetical object locations, $\mu_q$, $\mu_r$, and $\mu_s$, in the feature space. With the probability theory approach, the locations are considered to be nuisance parameters and not necessary information if interest is only in the optimal assignment. The standard approach is to sum or integrate out the nuisance parameters, which here is summation or integration over all possible values (hypotheses) of the $\mu_q$, $\mu_r$, and $\mu_s$ locations. Summation is used, with the viewpoint that the feature space is finite and numerable. Continuous feature spaces are discussed later because of additional, complicating considerations. The summation is mathematically represented as

$$P(h(a_K) \mid D, I) \propto \sum_{\mu_1} \cdots \sum_{\mu_K} P(H(a_K, \mu_1, \ldots, \mu_K) \mid D, I) , \tag{21}$$

where $a_K$ represents one association list between the hypothetical objects and sensor objects. Additional subscripts or parameters are left off for now to avoid complicating the equations. The association, $a_K$, can be considered to be an element in the set of all possible associations, $A_K$. To reiterate, each summation is over all the values that $\mu_q$, $\mu_r$, and $\mu_s$ can assume. The probability function expands to

$$\begin{aligned}
P(h(a_K) \mid D, I) \propto{} & \sum_{\mu_1} \cdots \sum_{\mu_K} P(H(a_K, \mu_1, \ldots, \mu_K) \mid I) \prod_{s} P(\bar{d}_1 \mid h_s(\mu_s), I)\, P(d_{2j(s)} \mid h_s(\mu_s), I) \\
& \times \prod_{r} P(d_{1i(r)} \mid h_r(\mu_r), I)\, P(\bar{d}_2 \mid h_r(\mu_r), I) \\
& \times \prod_{q} P(d_{1i(q)} \mid h_q(\mu_q), I)\, P(d_{2j(q)} \mid h_q(\mu_q), I) .
\end{aligned} \tag{22}$$

Note  that  the  probability  of  the  hypothesis,  conditioned  on  the  prior  data,  may  be  a  function  of  the 
summands  and  can  affect  the  resulting  probability.  The  next  assumption  made  is  that 

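(A11) the hypothesis parameters in $P(H(a_K, \mu_1, \ldots, \mu_K) \mid I)$ are statistically independent of one another and are thus separable into the product $P(h(a_K) \mid I)\, P(h(\mu_1) \mid I) \cdots P(h(\mu_K) \mid I)$,

leading to another simplification,

$$\begin{aligned}
P(h(a_K) \mid D, I) \propto{} & \sum_{\mu_1} \cdots \sum_{\mu_K} P(h(a_K) \mid I) \prod_{s} P(h(\mu_s) \mid I)\, P(\bar{d}_1 \mid h_s(\mu_s), I)\, P(d_{2j(s)} \mid h_s(\mu_s), I) \\
& \times \prod_{r} P(h(\mu_r) \mid I)\, P(d_{1i(r)} \mid h_r(\mu_r), I)\, P(\bar{d}_2 \mid h_r(\mu_r), I) \\
& \times \prod_{q} P(h(\mu_q) \mid I)\, P(d_{1i(q)} \mid h_q(\mu_q), I)\, P(d_{2j(q)} \mid h_q(\mu_q), I) .
\end{aligned} \tag{23}$$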

Note  that  a  given  hypothetical  location  really  only  matches  with  one  of  the  probability  duos  inside  one  of 
the  three  product  terms.  The  summations  over  the  locations  can  be  moved  inside  the  products, 
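$$\begin{aligned}
P(h(a_K) \mid D, I) \propto{} & P(h(a_K) \mid I) \prod_{s} \sum_{\mu_s} P(h(\mu_s) \mid I)\, P(\bar{d}_1 \mid h_s(\mu_s), I)\, P(d_{2j(s)} \mid h_s(\mu_s), I) \\
& \times \prod_{r} \sum_{\mu_r} P(h(\mu_r) \mid I)\, P(d_{1i(r)} \mid h_r(\mu_r), I)\, P(\bar{d}_2 \mid h_r(\mu_r), I) \\
& \times \prod_{q} \sum_{\mu_q} P(h(\mu_q) \mid I)\, P(d_{1i(q)} \mid h_q(\mu_q), I)\, P(d_{2j(q)} \mid h_q(\mu_q), I) .
\end{aligned} \tag{24}$$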

The  use  of  the  characters  q ,  r ,  and  s  for  three  different  products  obscures  this  relationship. 
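The step of moving the location summations inside the products can be checked numerically with a short sketch. It assumes a hypothetical three-value countable feature space and three hypothetical objects; the array `factors` and its numbers are invented for illustration, with each row standing in for the product of prior and data probabilities that belongs to one hypothetical object.

```python
from itertools import product
from math import isclose, prod

# Hypothetical per-object factors on a three-value feature space.
# Row k stands in for P(h(mu_k)|I) P(d_1|h_k(mu_k),I) P(d_2|h_k(mu_k),I).
factors = [
    [0.10, 0.40, 0.20],  # object 1 evaluated at mu = 0, 1, 2
    [0.30, 0.30, 0.10],  # object 2
    [0.05, 0.20, 0.60],  # object 3
]
n_values = len(factors[0])

# Summations outside the product: loop over every joint location tuple.
outside = sum(
    prod(f[m] for f, m in zip(factors, mus))
    for mus in product(range(n_values), repeat=len(factors))
)

# Summations moved inside the product: one sum per object.
inside = prod(sum(f) for f in factors)

print(outside, inside, isclose(outside, inside))
```

The two forms agree because each location appears in exactly one factor. The outer form visits every joint location tuple, while the inner form needs only one sum per object.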

The  next  assumption  made  is  that 

(A12) the assignment hypotheses $h(a_K)$ are conditionally independent of the prior information, $I$, and the probability $P(h(a_K) \mid I)$ is uniformly distributed across the assignment hypotheses.

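This term can therefore be ignored, which leads to

$$\begin{aligned}
P(h(a_K) \mid D, I) \propto{} & \prod_{s} \sum_{\mu_s} P(h(\mu_s) \mid I)\, P(\bar{d}_1 \mid h_s(\mu_s), I)\, P(d_{2j(s)} \mid h_s(\mu_s), I) \\
& \times \prod_{r} \sum_{\mu_r} P(h(\mu_r) \mid I)\, P(d_{1i(r)} \mid h_r(\mu_r), I)\, P(\bar{d}_2 \mid h_r(\mu_r), I) \\
& \times \prod_{q} \sum_{\mu_q} P(h(\mu_q) \mid I)\, P(d_{1i(q)} \mid h_q(\mu_q), I)\, P(d_{2j(q)} \mid h_q(\mu_q), I) .
\end{aligned} \tag{25}$$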

This  equation  has  the  form  that  will  be  used  to  estimate  the  most  probable  track  assignment. 
3. LINEAR ASSIGNMENT


Equation  (25)  can  be  used  to  make  the  intended  connection  to  linear  programming  and  linear 
assignment  algorithms.  If  the  negative  log  of  Equation  (25)  is  taken,  the  products  of  probabilities  convert 
to  summations  of  log-probabilities. 

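$$\begin{aligned}
\ln P(h(a_K) \mid D, I) \propto{} & \sum_{s} \ln \sum_{\mu_s} P(h(\mu_s) \mid I)\, P(\bar{d}_1 \mid h_s(\mu_s), I)\, P(d_{2j(s)} \mid h_s(\mu_s), I) \\
& + \sum_{r} \ln \sum_{\mu_r} P(h(\mu_r) \mid I)\, P(d_{1i(r)} \mid h_r(\mu_r), I)\, P(\bar{d}_2 \mid h_r(\mu_r), I) \\
& + \sum_{q} \ln \sum_{\mu_q} P(h(\mu_q) \mid I)\, P(d_{1i(q)} \mid h_q(\mu_q), I)\, P(d_{2j(q)} \mid h_q(\mu_q), I) .
\end{aligned} \tag{26}$$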

With  the  equation  transformed  to  a  summation,  the  problem  of  finding  the  assignment  list  can  be 
accomplished  with  linear  programming  algorithms  [2,  3,  11,  12].  This  procedure  avoids  the  need  to 
calculate  all  the  permutations  of  assignments  to  find  the  most  probable.  Instead,  a  matrix  can  be 
constructed  and  a  linear  assignment  algorithm  can  be  used  to  determine  the  most  probable  assignment  aK 
without  having  to  evaluate  every  permutation  of  the  probability  products. 
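A rough sense of why exhaustive evaluation is avoided comes from counting the candidate assignments. The sketch below is an illustration added here, not part of the derivation: it assumes that, under assumptions (A8) through (A10), each hypothesis pairs some tracks of sensor one with distinct tracks of sensor two and leaves the remaining tracks unpaired. The function name `count_association_hypotheses` is hypothetical.

```python
from math import comb, factorial


def count_association_hypotheses(num_tracks_1, num_tracks_2):
    """Count the ways to pair tracks one-to-one, leaving the rest unpaired."""
    total = 0
    for m in range(min(num_tracks_1, num_tracks_2) + 1):
        # Choose which m tracks of each sensor are paired, then the m! ways
        # of matching the chosen tracks with one another.
        total += comb(num_tracks_1, m) * comb(num_tracks_2, m) * factorial(m)
    return total


for n in (2, 5, 10):
    print(n, count_association_hypotheses(n, n))
```

Under this counting, ten tracks per sensor already produce more than two hundred million candidate assignments, which is the motivation for replacing enumeration with a linear assignment algorithm.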

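Shorter representations of the negative logarithms of the summations will prove useful in the continued derivation:

$$\ell_{\bar{1}j} = -\ln \sum_{\mu_s} P(h(\mu_s) \mid I)\, P(\bar{d}_1 \mid h_s(\mu_s), I)\, P(d_{2j(s)} \mid h_s(\mu_s), I) , \tag{27}$$

$$\ell_{i\bar{2}} = -\ln \sum_{\mu_r} P(h(\mu_r) \mid I)\, P(d_{1i(r)} \mid h_r(\mu_r), I)\, P(\bar{d}_2 \mid h_r(\mu_r), I) , \tag{28}$$

$$\ell_{ij} = -\ln \sum_{\mu_q} P(h(\mu_q) \mid I)\, P(d_{1i(q)} \mid h_q(\mu_q), I)\, P(d_{2j(q)} \mid h_q(\mu_q), I) . \tag{29}$$

A linear assignment matrix can be constructed for Equation (26):

$$M = \begin{pmatrix}
\ell_{11} & \ell_{12} & \cdots & \ell_{1J} & \ell_{1\bar{2}} & \infty & \cdots & \infty \\
\ell_{21} & \ell_{22} & \cdots & \ell_{2J} & \infty & \ell_{2\bar{2}} & \cdots & \infty \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
\ell_{I1} & \ell_{I2} & \cdots & \ell_{IJ} & \infty & \infty & \cdots & \ell_{I\bar{2}} \\
\ell_{\bar{1}1} & \infty & \cdots & \infty & 0 & 0 & \cdots & 0 \\
\infty & \ell_{\bar{1}2} & \cdots & \infty & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
\infty & \infty & \cdots & \ell_{\bar{1}J} & 0 & 0 & \cdots & 0
\end{pmatrix} . \tag{30}$$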


This  matrix  has  four  quadrants.  The  upper-left  quadrant  contains  the  detected  track  association  costs.  The 
lower-left  and  upper-right  quadrants  contain  the  missed  detection  costs  on  the  diagonals.  The  off-diagonal 
elements of these submatrices are set to infinity to prohibit their selection as an assignment, which also leads to improved speed from most assignment algorithms. These two quadrants are always square matrices. The
lower-right  quadrant  is  filled  with  zeros  to  counterbalance  the  assignment  costs  in  the  upper-left  quadrant. 
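The four-quadrant layout can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the cost values are invented, the helper names `build_full_matrix` and `brute_force_assignment` are hypothetical, and an exhaustive search over permutations stands in for a Munkres or Jonker-Volgenant solver only because the example is tiny.

```python
from itertools import permutations
from math import inf


def build_full_matrix(pair_cost, miss_by_2, miss_by_1):
    """Assemble the four-quadrant cost matrix described in the text.

    pair_cost[i][j]: cost of associating sensor-one track i with sensor-two track j
    miss_by_2[i]:    cost that sensor-one track i was missed by sensor two
    miss_by_1[j]:    cost that sensor-two track j was missed by sensor one
    """
    n1, n2 = len(miss_by_2), len(miss_by_1)
    rows = []
    for i in range(n1):
        # Upper-right quadrant: miss cost on the diagonal, infinity elsewhere.
        upper_right = [miss_by_2[i] if c == i else inf for c in range(n1)]
        rows.append(list(pair_cost[i]) + upper_right)
    for j in range(n2):
        # Lower-left quadrant: miss cost on the diagonal, infinity elsewhere.
        lower_left = [miss_by_1[j] if c == j else inf for c in range(n2)]
        # Lower-right quadrant: zeros.
        rows.append(lower_left + [0.0] * n1)
    return rows


def brute_force_assignment(matrix):
    """Return (cost, columns) of the cheapest one-to-one row/column assignment."""
    best_cost, best_cols = inf, None
    for cols in permutations(range(len(matrix))):
        cost = sum(matrix[r][c] for r, c in enumerate(cols))
        if cost < best_cost:
            best_cost, best_cols = cost, cols
    return best_cost, best_cols


# Hypothetical negative-log costs for two tracks per sensor.
pair_cost = [[1.0, 6.0], [7.0, 9.0]]
miss_by_2 = [3.0, 3.0]
miss_by_1 = [3.0, 3.0]

full = build_full_matrix(pair_cost, miss_by_2, miss_by_1)
cost, cols = brute_force_assignment(full)

# Only assignments that land in the upper-left quadrant are track-to-track pairs.
n1, n2 = len(miss_by_2), len(miss_by_1)
pairs = [(r, c) for r, c in enumerate(cols) if r < n1 and c < n2]
print(cost, cols, pairs)
```

With these made-up costs, the first tracks of the two sensors are paired, and the second track of each sensor is declared to have been missed by the other sensor.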


An  interesting  aspect  to  the  linear  assignment  problem  is  that  the  addition  of  a  constant  value  to  the 
elements  of  any  row  or  column  does  not  change  the  optimal  assignment  solution.  This  fact  provides  for  a 
way  to  slightly  reduce  the  complexity  of  the  matrix.  It  should  be  noted  that  if  different  assumptions  are 
adopted  for  the  construction  of  the  probability  equation  than  have  been  adopted  to  reach  this  point,  this 
simplification  might  not  be  possible.  This  simplification  is  not  necessary  for  the  linear  assignment 
algorithm  to  operate  but  provides  a  way  to  sometimes  simplify  the  implementation  of  the  algorithm. 
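The invariance to row and column offsets can be demonstrated on a small case. The sketch below assumes a hypothetical four-by-four matrix with the quadrant layout described above and invented costs; `best_columns` and `total_cost` are illustrative helper names, and the exhaustive search is used only because the matrix is tiny.

```python
from itertools import permutations
from math import inf


def total_cost(matrix, cols):
    """Sum the matrix elements selected by assigning row r to column cols[r]."""
    return sum(matrix[r][c] for r, c in enumerate(cols))


def best_columns(matrix):
    """Exhaustively find the column permutation with the lowest total cost."""
    return min(permutations(range(len(matrix))), key=lambda cols: total_cost(matrix, cols))


# Hypothetical full matrix: pair costs, miss costs on diagonals, zeros.
full = [
    [1.0, 6.0, 3.0, inf],
    [7.0, 9.0, inf, 3.0],
    [3.0, inf, 0.0, 0.0],
    [inf, 3.0, 0.0, 0.0],
]

# Subtract the miss cost of each sensor-one track from its row and the
# miss cost of each sensor-two track from its column.
row_shift = [3.0, 3.0, 0.0, 0.0]
col_shift = [3.0, 3.0, 0.0, 0.0]
shifted = [
    [full[r][c] - row_shift[r] - col_shift[c] for c in range(4)]
    for r in range(4)
]

cols_full = best_columns(full)
cols_shifted = best_columns(shifted)
offset = sum(row_shift) + sum(col_shift)
print(cols_full, cols_shifted, total_cost(shifted, cols_shifted) + offset)
```

The optimal permutation is unchanged, and the original optimal cost is recovered only after the subtracted offsets are added back to the shifted cost.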


First,  the  thresholds  on  the  diagonals  are  subtracted  from  the  appropriate  rows  and  columns  to 
produce  the  assignment  matrix. 


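$$\begin{pmatrix}
\ell_{11} - \ell_{1\bar{2}} - \ell_{\bar{1}1} & \ell_{12} - \ell_{1\bar{2}} - \ell_{\bar{1}2} & \cdots & 0 & \infty & \cdots \\
\ell_{21} - \ell_{2\bar{2}} - \ell_{\bar{1}1} & \ell_{22} - \ell_{2\bar{2}} - \ell_{\bar{1}2} & \cdots & \infty & 0 & \cdots \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots \\
0 & \infty & \cdots & 0 & 0 & \cdots \\
\infty & 0 & \cdots & 0 & 0 & \cdots \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots
\end{pmatrix} \tag{31}$$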


The  slight  drawback  with  this  matrix  is  that  the  optimal  assignment  cost  that  is  calculated  by  most 
linear  assignment  algorithms  is  not  proportional  to  the  negative  log  of  the  assignment  probability  unless 
the  diagonal  probabilities  are  retained  and  added  back  into  the  optimal  assignment  cost. 
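Because three of the quadrants now contain zeros or infinities, the linear assignment problem can be solved with the smaller matrix

$$M = \begin{pmatrix}
\ell_{11} - \ell_{1\bar{2}} - \ell_{\bar{1}1} & \ell_{12} - \ell_{1\bar{2}} - \ell_{\bar{1}2} & \cdots & \ell_{1J} - \ell_{1\bar{2}} - \ell_{\bar{1}J} \\
\ell_{21} - \ell_{2\bar{2}} - \ell_{\bar{1}1} & \ell_{22} - \ell_{2\bar{2}} - \ell_{\bar{1}2} & \cdots & \ell_{2J} - \ell_{2\bar{2}} - \ell_{\bar{1}J} \\
\vdots & \vdots & \ddots & \vdots \\
\ell_{I1} - \ell_{I\bar{2}} - \ell_{\bar{1}1} & \ell_{I2} - \ell_{I\bar{2}} - \ell_{\bar{1}2} & \cdots & \ell_{IJ} - \ell_{I\bar{2}} - \ell_{\bar{1}J}
\end{pmatrix} , \tag{32}$$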


and the matched pairs of the optimal assignment whose matrix elements are less than zero can be accepted as the assignments that should be made. Matched pairs whose elements are greater than zero would have matched with the missed-object elements in the larger matrix.


In  general,  this  smaller  matrix  is  not  square,  so  the  linear  assignment  algorithm  has  to  be  able  to 
find  solutions  to  nonsquare  matrices,  or  zero-element  rows  and  columns  have  to  be  added  to  make  the 
matrix  square  for  linear  assignment  algorithms  that  can  only  solve  square  matrices.  If  the  case  is  the  latter, 
then  matched-pair  indices  that  end  up  in  the  added  rows  or  columns  are  considered  to  be  one  object  that 
was  detected  by  one  sensor  and  not  detected  by  the  other. 


If the diagonal threshold pair sums, $\ell_{i\bar{2}} + \ell_{\bar{1}j}$, are all equal or chosen to be equal, the subtraction of the diagonal terms can be ignored and the upper-left quadrant of Equation (30) solved with a linear
assignment  algorithm  capable  of  solving  rectangular  matrices.  If  the  linear  assignment  algorithm  can 
solve  only  square  matrices,  additional  rows  or  columns  with  values  equal  to  the  threshold  pair  sum  are 
added  to  square  the  matrix  to  allow  for  a  solution.  Pair  assignments  with  a  cost  greater  than  or  equal  to  the 
threshold  sum  are  matches  that  are  rejected  because  the  match  with  missed-object  elements  is  equally  or 
more  probable. 
4. PROBABILITY ESTIMATION


The  basis  for  the  association  algorithm  now  exists;  functions  must  be  obtained  for  the  conditional 
probabilities 


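$$P(h(\mu_k) \mid I) , \tag{33}$$

$$P(\bar{d}_1 \mid h_s(\mu_s), I) , \tag{34}$$

$$P(\bar{d}_2 \mid h_r(\mu_r), I) , \tag{35}$$

$$P(d_{1i(k)} \mid h_k(\mu_k), I) , \text{ and} \tag{36}$$

$$P(d_{2j(k)} \mid h_k(\mu_k), I) . \tag{37}$$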

Note  that  Equations  (36)  and  (37)  are  probabilities  of  the  data,  conditioned  on  the  hypothesis  and 
the  prior  information,  and  not  probabilities  of  the  hypothesis,  conditioned  on  the  data  and  prior 
information.  The  two  conditional  probabilities  are  different.  The  natural  inclination  is  to  think  that  a 
multitarget  tracker  at  a  sensor  would  be  designed  to  estimate  the  most-probable  hypothetical  tracks  that 
have  been  conditioned  on  the  data  that  the  sensor  has  collected.  For  track  association  between  sensors,  the 
desired  information  is  a  function  that  gives  the  probability  of  the  data,  conditioned  on  the  hypotheses.  The 
distinction  is  subtle,  but  important.  The  importance  depends  upon  the  nature  of  the  probability 
distribution  functions  that  are  used.  The  special  symmetry  property  of  the  Gaussian  functions  allows  for 
the  distinction  usually  to  be  ignored  without  major  consequences.  Interchange  of  the  data  parameters  and 
hypothesis  parameters  in  the  Gaussian  function  results  in  the  same  function.  In  addition,  the  integral  of 
the  Gaussian  function  over  the  hypothesis  parameters  and  the  integral  of  the  Gaussian  function  over  the 
data  parameters  are  equal  to  one. 

The first probability function selected is the probability of the hypothesis states $P(h(\mu_k) \mid I)$ on the feature space. Prior information may indicate that the hypothetical objects are less or more likely to occur in different regions of the feature space. Commonly, the adopted assumption is that

(A13) the prior information does not provide any indication that objects are more or less likely to occur in any given region of feature space. In other words, the probability distribution for the object locations is a uniform distribution across the metric space. This probability then can be set to either

$$P(h(\mu_k) \mid I) = \frac{1}{N} \tag{38}$$

for $N$ elements of a countable feature space, or

$$P(h(\mu_k) \mid I) = \frac{1}{V} \tag{39}$$

for a volume $V$ in a bounded, continuous feature space. For now, if the feature space is unbounded and continuous, a suitable boundary is chosen for the integrals so that the bounded, continuous equality can be used.

Functions  for  the  remaining  probabilities  require  additional  assumptions  about  feature  spaces,  so 
specific  cases  of  the  association  problem  are  examined.  Real,  bounded,  feature  spaces  are  reviewed  first 
because  the  association  problem  is  most  often  formulated  with  Gaussian  probability  density  functions  in  a 
real  feature  space.  Feature  space  is  considered  to  be  of  an  integer  number  of  dimensions.  It  is  assumed 
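that the probabilities $P(d_{1i(k)} \mid h_k(\mu_k), I)$ and $P(d_{2j(k)} \mid h_k(\mu_k), I)$ are Gaussian functions. The trackers at the sensors are assumed to generate a state that provides a mean position $\langle x \rangle$ and covariance $\Sigma$. Because Gaussian functions are symmetric for interchange in $\mu_k$ and $\langle x \rangle$, the Gaussian function from the sensors can be used for the detected object probabilities,

$$G(\mu_k, \langle x \rangle, \Sigma) = \frac{1}{\sqrt{|2\pi\Sigma|}} \exp\left[ -\frac{1}{2} (\mu_k - \langle x \rangle)^T \Sigma^{-1} (\mu_k - \langle x \rangle) \right] . \tag{40}$$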

Since  the  original  calculations  were  carried  out  with  numerable  sets,  we  need  to  convert  the 
summations  to  integrations.  First  we’ll  estimate  the  probability  for  the  hypothesis  location  being  within  a 
small  volume  element  Av .  With  the  uniform  hypothesis  prior  assumption,  the  probability  becomes 


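$$P(h(\mu_q) \mid I)\, P(d_{1i(q)} \mid h_q(\mu_q), I)\, P(d_{2j(q)} \mid h_q(\mu_q), I) \approx \frac{\Delta v_\mu}{V}\, G(\mu_q, \langle x_{1i} \rangle, \Sigma_{1i})\, \Delta v_{1i}\, G(\mu_q, \langle x_{2j} \rangle, \Sigma_{2j})\, \Delta v_{2j} . \tag{41}$$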

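Note the three different $\Delta v$ terms. It is important to distinguish between them because many probability calculations have gone awry when the terms have been confused. There may be only one feature space, but there are three parameter spaces. The parameter space volume elements are those parameters in the probability term that appear to the left of the ‘|’ character.

The $\Delta v_\mu$ term is used to convert from summation to integration over the parameter space for $\mu_k$. The integration of the preceding probability results in the approximation

$$P(\cdot) \approx \frac{\Delta v_{1i}\, \Delta v_{2j}}{V \sqrt{|2\pi(\Sigma_{1i} + \Sigma_{2j})|}} \exp\left[ -\frac{1}{2} (\langle x_{1i} \rangle - \langle x_{2j} \rangle)^T (\Sigma_{1i} + \Sigma_{2j})^{-1} (\langle x_{1i} \rangle - \langle x_{2j} \rangle) \right] \tag{42}$$

when $V$ is sufficiently large. The negative log of this term is

$$-\ln P(\cdot) \approx \frac{1}{2} (\langle x_{1i} \rangle - \langle x_{2j} \rangle)^T (\Sigma_{1i} + \Sigma_{2j})^{-1} (\langle x_{1i} \rangle - \langle x_{2j} \rangle) + \frac{1}{2} \ln |2\pi(\Sigma_{1i} + \Sigma_{2j})| - \ln \frac{\Delta v_{1i}\, \Delta v_{2j}}{V} . \tag{43}$$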
The first term of Equation (43) is one-half the Mahalanobis distance and is sometimes called either
the  covariance-weighted  distance  between  the  two  means  or  the  chi-squared  distance.  The  second  term  in 
the  log-likelihood  involves  the  combined  covariances.  The  third  term  is  an  incremental  length.  The  units 
associated  with  the  second  and  third  terms  cancel. 

Many assignment algorithms have been constructed that use only the Mahalanobis distance term,
leading to algorithms that perform poorly when the covariances differ from track to track, especially
within the set of tracks for one sensor. When the covariance term is neglected, the less accurate tracks steal
associations from the more accurate tracks because the Mahalanobis distance is not directly proportional
to the association probability. Mathematically, the Mahalanobis distance is only appropriate when the
second term of Equation (43) is constant for all combinations of $i$ and $j$. Appropriate simple changes to
the threshold log-probabilities $\ell_{\bar{1}j}$ and $\ell_{i\bar{2}}$ have to be made if the Mahalanobis distance is used instead of
a probabilistic cost. Generally, the threshold is selected as a limit on the number of standard deviations
before track associations are unacceptable.

A simple, one-dimensional numerical example can illustrate the problem with the inappropriate use
of the Mahalanobis distance. Assume that sensor one reports one object with mean and covariance (0,1).
Sensor two reports two objects with means and covariances (1,1) and (7,100); the second object's standard deviation is
ten times larger than the first track's. This difference in accuracy is not uncommon for sensor track
data.  The  Mahalanobis  distances  for  the  two  possible  associations  are  0.2500  and  0.2426.  The 
Mahalanobis  distance  would  select  the  less  accurate  track  from  sensor  two  as  the  one  that  associates  with 
the  single  track  from  sensor  one.  The  two  association  probabilities  are  0.2196  and  0.0311.  The  odds  are 
7:1  that  the  more  accurate  track  is  the  one  that  associates  with  sensor  one’s  track,  but  the  use  of  only  the 
Mahalanobis  distance  selects  the  less  accurate  track.  Constructing  an  assignment  algorithm  that  uses  only 
Mahalanobis  distance  will  generate  improbable  assignments. 
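
The one-dimensional example above can be checked numerically. The following is a minimal sketch, assuming that the "Mahalanobis distance" quoted in the text is the first term of Equation (43) and that the association probability is the Gaussian density of the mean difference under the combined variance; the function names are illustrative and do not come from the report.

```python
import math

def half_mahalanobis_sq(m1, v1, m2, v2):
    """First term of Equation (43) in one dimension: 0.5 * (m1 - m2)^2 / (v1 + v2)."""
    return 0.5 * (m1 - m2) ** 2 / (v1 + v2)

def association_density(m1, v1, m2, v2):
    """Gaussian density of the mean difference under the combined variance v1 + v2."""
    norm = math.sqrt(2.0 * math.pi * (v1 + v2))
    return math.exp(-half_mahalanobis_sq(m1, v1, m2, v2)) / norm

# (mean, variance) pairs taken from the example in the text.
sensor_one = (0.0, 1.0)
sensor_two = [(1.0, 1.0), (7.0, 100.0)]

distances = [half_mahalanobis_sq(*sensor_one, *t) for t in sensor_two]
densities = [association_density(*sensor_one, *t) for t in sensor_two]
odds = densities[0] / densities[1]

for k, (d, p) in enumerate(zip(distances, densities), start=1):
    print(f"sensor-two track {k}: distance term = {d:.4f}, density = {p:.4f}")
print(f"odds in favour of the more accurate track: {odds:.1f} : 1")
```

Under these assumptions the distance term is smaller for the less accurate track, while the density is several times larger for the more accurate one, which is the reversal described in the text.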

The threshold terms, $\ell_{\bar{1}j}$ and $\ell_{i\bar{2}}$, must be determined, with somewhat more difficulty than that of
selecting thresholds for Mahalanobis-distance assignment algorithms. For these algorithms, designers
usually select the number of standard deviations to use to set the threshold level. Threshold selection is
more difficult with the correct log-probability formulation, but that formulation provides a more powerful technique
for setting thresholds because additional factors that influence detection can be incorporated into the
threshold function.

If prior information is available on the sensor's sensitivity to detecting objects, this sensitivity can
be taken into account in the construction of the functions for $p(\bar{d}_1 \mid h_s(\mu_s), I)$ and $p(\bar{d}_2 \mid h_r(\mu_r), I)$.

This  sensitivity  can  provide  very  powerful  information  with  significant  impact  on  the  optimal  assignment. 
In  this  example,  it  is  assumed  that  this  information  is  not  available,  and  suitable  approximations  to  the 
threshold terms are constructed. The detected-object probabilities are assumed to be Gaussian functions,

$$P(h(\mu_s) \mid I)\, p(\bar{d}_1 \mid h_s(\mu_s), I)\, p(d_{2j(s)} \mid h_s(\mu_s), I) = \frac{\Delta v_\mu}{V}\, p(\bar{d}_1)\, G(\mu_s, \langle x_{2j} \rangle, \Sigma_{2j})\, \Delta v_{2j} . \tag{44}$$


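The probability of a missed detection is assumed to be independent of the hypothetical location.
The integration over $\Delta v_\mu$, where $V$ is sufficiently large, leads to the result

$$\ell_{\bar{1}j} = -\ln\left( p(\bar{d}_1) \right) - \ln\left( \Delta v_{2j} / V \right) . \tag{45}$$

The other threshold term produces

$$P(h(\mu_r) \mid I)\, p(d_{1i(r)} \mid h_r(\mu_r), I)\, p(\bar{d}_2 \mid h_r(\mu_r), I) = \frac{\Delta v_\mu}{V}\, G(\mu_r, \langle x_{1i} \rangle, \Sigma_{1i})\, \Delta v_{1i}\, p(\bar{d}_2) \tag{46}$$

and

$$\ell_{i\bar{2}} \approx -\ln\left( p(\bar{d}_2) \right) - \ln\left( \Delta v_{1i} / V \right) . \tag{47}$$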

Because pairs of thresholds are compared against associations, the $\Delta v_{1i}$ and $\Delta v_{2j}$ terms appear
once in all hypothetical association probabilities. They can be ignored. The same is not true for the
volume term $V$ from the uniform prior of the hypothesis. The threshold pairs contribute two $1/V$ terms,
while the association term contributes one.

One possibility is to remove the $1/V$ term from the association log likelihood, and distribute the
remaining $1/V$ term across the thresholds if the reduced linear assignment matrix is to be used:

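$$\ell_{ij} = \tfrac{1}{2} \left( \langle x_{1i} \rangle - \langle x_{2j} \rangle \right)^{T} (\Sigma_{1i} + \Sigma_{2j})^{-1} \left( \langle x_{1i} \rangle - \langle x_{2j} \rangle \right) + \tfrac{1}{2} \ln \left| 2\pi (\Sigma_{1i} + \Sigma_{2j}) \right| , \tag{48}$$

$$\ell_{\bar{1}j} = -\ln\left( p(\bar{d}_1) \right) + \tfrac{1}{2} \ln(V) , \tag{49}$$

$$\ell_{i\bar{2}} = -\ln\left( p(\bar{d}_2) \right) + \tfrac{1}{2} \ln(V) . \tag{50}$$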
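
A one-dimensional sketch of Equations (48) through (50) may help show how the reduced costs are compared. It assumes scalar variances, location-independent missed-detection probabilities, and a single pair of tracks; the function names and the numerical values of the missed-detection probabilities and of $V$ are hypothetical choices for illustration only.

```python
import math

def association_cost(m1, v1, m2, v2):
    """Equation (48) in one dimension, with combined variance v1 + v2."""
    s = v1 + v2
    return 0.5 * (m1 - m2) ** 2 / s + 0.5 * math.log(2.0 * math.pi * s)

def threshold_cost(p_miss, volume):
    """Equations (49) and (50): -ln p(missed detection) + 0.5 * ln V."""
    return -math.log(p_miss) + 0.5 * math.log(volume)

def prefer_association(track1, track2, p_miss1, p_miss2, volume):
    """True when pairing the two tracks costs less than leaving both unassociated."""
    unpaired = threshold_cost(p_miss1, volume) + threshold_cost(p_miss2, volume)
    return association_cost(*track1, *track2) < unpaired

# Hypothetical values: each sensor misses an object with probability 0.1, V = 100.
near_pair = prefer_association((0.0, 1.0), (1.0, 1.0), 0.1, 0.1, 100.0)
far_pair = prefer_association((0.0, 1.0), (50.0, 1.0), 0.1, 0.1, 100.0)
print(near_pair, far_pair)
```

In a full assignment problem these costs would populate the reduced linear assignment matrix rather than being compared pair by pair; the pairwise comparison here only illustrates how the $\tfrac{1}{2}\ln(V)$ terms enter the thresholds.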

The volume element is problematic, especially when the derivations are extended to an unbounded
feature space. All the log-likelihood functions tend to infinity for infinite feature spaces if
Equations (43), (45), and (47) are used. If Equations (48) through (50) are used, the limit leads to the
maximum number of associations being made between the tracks from the two sensors because the limits
for the thresholds are infinities. One approach to this problem is to use the argument that follows.

An infinite number of hypothetical objects are assumed to be in infinite space; their density is
$1/V_p$. It is next assumed that the volume associated with the density is sparse enough that the tracks
observed by the sensors are individual, distinguishable objects. The probability density for a hypothetical
object that is detected by a sensor is uniform over this volume. The density function integrals with this
prior probability function are assumed to be reasonably approximated by the integral of two Gaussian
functions over an infinite volume, with the probability densities of Gaussian functions falling off rapidly
enough that the regions with zero prior probability contribute little to the integral. This process matches in
spirit the derivations of Stone et al. [4], who chose the value of $V_p$ to be a volume equivalent to three
standard deviations of the combined Gaussian covariances. For example,


$$V_p \approx \left| 9 \left( \max(\Sigma_{1i}) + \max(\Sigma_{2j}) \right) \right|^{1/2} \tag{51}$$


could be an equivalent technique to estimate a region for a hypothetical object. The thresholds and
association probabilities then can be calculated with $V_p$ replacing $V$ in the appropriate equations.


4.1  ASSOCIATION  WITH  OTHER  FEATURES  BEYOND  METRICS 

The use of metric features for the association is limited by the resolution of the sensors. If metric
accuracy cannot be improved, improved performance from the association algorithm can be achieved only
by considering other feature information in addition to the metric. The additional measurements, or
features, can be used to strengthen the association between tracks or to prevent association when it is
likely that two tracks that lie very close together belong to different objects and should not be associated. This
situation is likely to occur for closely spaced targets, where one object is of a type that can be observed by one
sensor but not the other, and the other object is of a type observed by the other sensor but not the
first. If there are measurable features that can provide enough probabilistic evidence that the two tracks
are two different objects, then the additional features can prevent the association.

Adding feature information is relatively straightforward, at least in terms of the association
algorithm. The real work lies in estimating the probabilities for the new feature information. The ease of
adding association evidence arises from the nature of the logarithmic function,

$$\ln(pq) = \ln(p) + \ln(q) . \tag{52}$$

If the probability distributions are separable,

$$P(x, f \mid H, I) = P(x \mid H, I)\, P(f \mid x, H, I) = P(x \mid H, I)\, P(f \mid H, I) , \tag{53}$$

then the association cost matrices can be calculated independently and simply added together for the
linear assignment operation. Unless there are doubts about the independence of the probability density
functions, or a belief that the probabilities of the features or the metrics are not realistic, there is no
multiplicative scale factor for either the metric or the feature cost matrix.
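
The additivity of separable costs can be sketched in a few lines. The example below assumes two tracks per sensor, ignores the missed-detection thresholds for brevity, and uses made-up cost values; the brute-force search over permutations merely stands in for whatever linear assignment solver is actually used.

```python
from itertools import permutations

def add_costs(metric, feature):
    """Element-wise sum of two cost matrices (negative log-probabilities add)."""
    return [[m + f for m, f in zip(mrow, frow)]
            for mrow, frow in zip(metric, feature)]

def best_assignment(cost):
    """Exhaustive minimum-cost one-to-one assignment; only sensible for tiny matrices."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda perm: sum(cost[i][perm[i]] for i in range(n)))

# Hypothetical costs: the metric alone barely prefers the pairing (0->0, 1->1),
# while the feature evidence strongly prefers the pairing (0->1, 1->0).
metric_cost = [[1.0, 1.2], [1.1, 1.0]]
feature_cost = [[3.0, 0.2], [0.3, 3.0]]
total_cost = add_costs(metric_cost, feature_cost)

print(best_assignment(metric_cost))
print(best_assignment(total_cost))
```

No weighting factor is applied to either matrix, in line with the statement above that none is needed when the probabilities are independent and realistic.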

An example of feature space separability can be used to demonstrate the separation by making two
assumptions:

(B1) the two feature spaces are statistically independent, and

(B2) the probability of not detecting a track is independent of the new feature space, denoted by $f$.

Assumption (B2) is made here only to provide a specific example; it assumes that the feature space $f$ has
no influence on the detectability of the targets. There are strong reasons not to assume (B2), if possible. It
is  preferable  to  be  able  to  account  for  differences  between  the  types  of  targets  that  the  two  sensors  are 
able  to  detect.  The  added  information  improves  the  probabilities  that  the  right  assignments  will  be 
selected.  The  potential  difficulty  is  that  the  missed-detection  probability  may  not  be  separable  into  two 
products,  complicating  the  construction  of  the  total  cost  matrix.  If  the  probabilities  are  fully  separable  for 
all  terms,  then  the  total  cost  matrix  can  be  constructed  with  the  addition  of  independent  matrices, 
traditionally,  one  for  the  metric  and  one  for  the  additional  feature  costs. 

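Given the adoption of Assumptions (B1) and (B2), the log-likelihood functions expand to

$$\ell_{\bar{1}j} = -\ln \sum_{\mu_s, f_s} P(h(\mu_s) \mid I)\, P(h(f_s) \mid I)\, p(\bar{d}_{1(s)} \mid h_s(\mu_s), I)\, p(d_{2j(s)} \mid h_s(\mu_s), I)\, p(d_{2j(s)} \mid h_s(f_s), I) \tag{54}$$

$$\ell_{i\bar{2}} = -\ln \sum_{\mu_r, f_r} P(h(\mu_r) \mid I)\, P(h(f_r) \mid I)\, p(d_{1i(r)} \mid h_r(\mu_r), I)\, p(d_{1i(r)} \mid h_r(f_r), I)\, p(\bar{d}_2 \mid h_r(\mu_r), I) \tag{55}$$

$$\ell_{ij} = -\ln \sum_{\mu_q, f_q} P(h(\mu_q) \mid I)\, P(h(f_q) \mid I)\, p(d_{1i(q)} \mid h_q(\mu_q), I)\, p(d_{1i(q)} \mid h_q(f_q), I)\, p(d_{2j(q)} \mid h_q(\mu_q), I)\, p(d_{2j(q)} \mid h_q(f_q), I) \tag{56}$$

and separate to

$$\ell_{\bar{1}j} = -\ln \sum_{\mu_s} P(h(\mu_s) \mid I)\, p(\bar{d}_1 \mid h_s(\mu_s), I)\, p(d_{2j(s)} \mid h_s(\mu_s), I) - \ln \sum_{f_s} P(h(f_s) \mid I)\, p(d_{2j(s)} \mid h_s(f_s), I) \tag{57}$$

$$\ell_{i\bar{2}} = -\ln \sum_{\mu_r} P(h(\mu_r) \mid I)\, p(d_{1i(r)} \mid h_r(\mu_r), I)\, p(\bar{d}_2 \mid h_r(\mu_r), I) - \ln \sum_{f_r} P(h(f_r) \mid I)\, p(d_{1i(r)} \mid h_r(f_r), I) \tag{58}$$

$$\ell_{ij} = -\ln \sum_{\mu_q} P(h(\mu_q) \mid I)\, p(d_{1i(q)} \mid h_q(\mu_q), I)\, p(d_{2j(q)} \mid h_q(\mu_q), I) - \ln \sum_{f_q} P(h(f_q) \mid I)\, p(d_{1i(q)} \mid h_q(f_q), I)\, p(d_{2j(q)} \mid h_q(f_q), I) \tag{59}$$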


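If the prior probability $P(h(f_k))$ is a uniform prior of $1/F$, then the feature costs simplify to

$$\ell^{f}_{\bar{1}j} = \ell^{f}_{i\bar{2}} = \ln F \tag{60}$$

$$\ell^{f}_{ij} = -\ln \left( \frac{1}{F} \sum_{f_q} p(d_{1i(q)} \mid h_q(f_q), I)\, p(d_{2j(q)} \mid h_q(f_q), I) \right) \tag{61}$$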


The  similarities  of  the  feature  probability  distributions  increase  or  decrease  the  likelihood  of  association 
between  tracks. 


Note that if all the tracks provide no evidence for or against the preference of any of the object
types, the probabilities are the uniform distribution, $1/F$. The feature costs then reduce to

$$\ell^{f}_{ij} = 2 \ln(F) , \tag{62}$$

which produces a matrix that provides no additional evidence for or against association. The likelihood
for the lack of feature evidence in Equation (62) is balanced by the two missed-detection
costs in Equation (60), which sum to $2 \ln(F)$.
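
The balance described above can be checked with a short sketch. It assumes a discrete feature space of $F$ classes, a uniform prior of $1/F$, and per-track class probability vectors that sum to one; the function names and the example vectors are hypothetical.

```python
import math

def feature_association_cost(p1, p2):
    """Equation (61): -ln( (1/F) * sum over classes of p1[f] * p2[f] )."""
    num_classes = len(p1)
    return -math.log(sum(a * b for a, b in zip(p1, p2)) / num_classes)

def feature_threshold_cost(num_classes):
    """Equation (60): ln F for a track that is left unassociated."""
    return math.log(num_classes)

def net_feature_evidence(p1, p2):
    """Association cost minus both threshold costs; negative values favour association."""
    return feature_association_cost(p1, p2) - 2.0 * feature_threshold_cost(len(p1))

uniform = [0.25, 0.25, 0.25, 0.25]
class_a = [0.85, 0.05, 0.05, 0.05]
class_b = [0.05, 0.85, 0.05, 0.05]

print(net_feature_evidence(uniform, uniform))  # no feature evidence either way
print(net_feature_evidence(class_a, class_a))  # both tracks favour the same class
print(net_feature_evidence(class_a, class_b))  # tracks favour different classes
```

With uninformative vectors the net evidence vanishes, matching Equation (62); agreement between the vectors lowers the association cost relative to the thresholds, and disagreement raises it.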

This example holds true only for those cases where the probability of detection is independent of
the feature subspace. If the probability of detection depends on the feature subspace, then the costs are
different and provide more information as to what objects should or should not be associated. This dependence makes
it possible to prevent association between closely spaced targets if the evidence supports sufficiently
different target classes, especially if the two sensors are better able to detect different classes of targets.

One  difficulty  with  additional  features  is  that  the  entire  cost  matrix  must  be  populated  with 
estimates  that  are  determined  from  probability  density  functions.  If  feature  information  is  missing  from  a 
track,  a  suitable  probability  estimate  still  must  be  selected,  such  as  a  uniform  probability  distribution.  The 
threshold  probabilities  need  to  be  calculated  for  tracks  with  missing  feature  data  as  well. 

A  weakness  with  using  class  or  identification  (ID)  probability  vectors  as  the  additional  feature 
subspace  is  that  the  feature  space  is  usually  not  large  enough  to  have  much  of  an  influence  on  the  optimal 
associations  in  comparison  to  the  influence  of  the  metric  association  matrix.  The  metric  space  is  usually  a 
real  space  of  two,  three,  six,  or  even  nine  dimensions.  The  class  or  ID  space  tends  to  be  an  integer  space 
with a finite span, usually with only a few tens of different classes or IDs. Even if the feature space is
relatively large, the probabilities must be extremely high or low to drive the associations that are made in
comparison to the metrics.

4.2  SPECTRAL  FEATURE  SPACES 

Many researchers have attempted using spectral information from radar cross section or radiometric
intensity measurements as a means to associate tracks between sensors, but in the opinion of this writer,
very few efforts have come close to succeeding. A knowledge of track association and its relationship
to probabilities makes it easier to see why this technique has been difficult. One reason for the difficulty
is the need for a way to transform a pair of spectra from two tracks into an association probability:

(63) 

for two spectra, $S_{1i}$ and $S_{2j}$. This need has generally been neglected in the problem definition. Another
reason for the difficulty with spectra association is that the frequency peaks in the two spectra are only
loosely correlated to each other. The observed peaks $f_0$ are usually a function of the spin and precession
frequencies,

$$f_0 = N f_s + M f_p , \tag{64}$$

where $N$ and $M$ are positive, zero, or negative integers, as long as $f_0$, $f_s$, and $f_p$ are positive. The
observed frequency peaks vary between sensors, depending on the nature of the target, sensor
characteristics, and viewing geometries.
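
The family of peaks allowed by Equation (64) can be enumerated directly, which shows why two sensors may report quite different peak sets for the same object. The sketch below assumes a bound on $|N|$ and $|M|$ (`max_order`) and an optional upper frequency limit; both parameters, the function name, and the example frequencies are hypothetical.

```python
def candidate_peaks(f_spin, f_prec, max_order=2, f_max=None):
    """Enumerate f_0 = N*f_s + M*f_p for |N|, |M| <= max_order, keeping positive values."""
    peaks = set()
    for n in range(-max_order, max_order + 1):
        for m in range(-max_order, max_order + 1):
            f = n * f_spin + m * f_prec
            if f > 0 and (f_max is None or f <= f_max):
                peaks.add(round(f, 9))  # rounding merges numerically identical peaks
    return sorted(peaks)

# Hypothetical spin and precession frequencies, in arbitrary units.
peaks_first_order = candidate_peaks(2.0, 0.5, max_order=1)
peaks_second_order = candidate_peaks(2.0, 0.5, max_order=2)
print(peaks_first_order)
print(len(peaks_second_order))
```

Which of these candidate peaks actually appears in a measured spectrum depends on the target, the sensor, and the viewing geometry, so the enumeration gives only the set of possibilities.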

The desire to use spectral data for track association is partially driven by human nature. The peaks
are often easy to discern. The large amount of data in a frequency plot leads to the assumption that there are a large
number of possible configurations of the plots and that very small association probabilities should be
obtainable when the sensors are observing different, dissimilar objects. Reconsidering the argument that
the  frequency  spectra  contain  a  lot  of  information  (data),  it  could  be  argued  that  the  actual  time-series 
measurements  should  have  even  more  data  that  can  be  used  to  estimate  even  smaller  association 
probabilities  for  different,  dissimilar  objects.  Most  researchers  quickly  recognize  that  it  is  difficult  to 
estimate  association  probabilities  with  raw  data  because  the  information  in  the  raw  data  plots  is  more 
difficult  to  discern,  and  that  it  is  even  more  difficult  to  construct  an  association  measure  for  the 
information  in  the  measurement  data  from  two  tracks.  One  way  to  simplify  the  construction  of  a 
probabilistic  association  measure  is  to  create  algorithms  that  extract  the  rotational  and  precessional 
frequency,  simplifying  the  generation  of  probability  density  functions  for  the  spin  and  precession 
frequencies.  The  algorithm  must  recognize  that  the  frequencies  from  one  sensor  may  be  a  rational  fraction 
of  the  corresponding  measured  frequencies  observed  for  the  track  from  the  other  sensor. 
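
One way to test for the rational-fraction relationship mentioned above is sketched here. It is only an illustration: the bound on the denominator and the relative tolerance are assumptions, the function name is hypothetical, and the result is a yes-or-no match rather than the probability density that a full association measure would require.

```python
from fractions import Fraction

def rational_match(f_a, f_b, max_denominator=4, rel_tol=0.02):
    """Return a small-integer ratio close to f_a / f_b, or None if none is close enough."""
    ratio = f_a / f_b
    candidate = Fraction(ratio).limit_denominator(max_denominator)
    if candidate == 0:
        return None
    if abs(float(candidate) - ratio) <= rel_tol * ratio:
        return candidate
    return None

# Hypothetical extracted frequencies from two sensors, in the same units.
print(rational_match(1.0, 2.01))   # close to one half
print(rational_match(3.02, 2.0))   # close to three halves
print(rational_match(1.0, 1.62))   # no simple ratio within tolerance
```

A probabilistic version would replace the hard tolerance with probability density functions for the extracted spin and precession frequencies.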
5. SUMMARY


This  report  has  presented  a  new  approach  to  deriving  track  association  algorithms  by  starting  with  a 
basic  probability  function  and  using  the  axioms  and  theorems  of  Bayesian  probability  theory  to  expand 
the  basic  function  into  the  necessary  format  for  track  association.  The  technique  accounts  for  all  the 
assumptions  that  are  necessary  to  arrive  at  a  derivation  of  a  track  association  algorithm.  The  derivation 
carried  out  here  results  in  a  typical  track  association  algorithm.  It  highlights  the  problems  that  are  often 
encountered  with  track  association  algorithms.  The  derivation  also  provides  a  template  that  can  be  used  to 
construct  other  track  association  algorithms;  the  interested  algorithm  developer  can  change  the 
assumptions and work through a new derivation to arrive at a different algorithm. The developer can also
derive more complex algorithms with the same derivation and assumptions by using more complex
(informative) probability density functions. Variations beyond the simple examples provided here can
include  changing  such  things  as  the  form  of  the  track  probability  density  functions,  detection  probability 
functions,  feature  spaces,  and  feature  space  decomposition. 

Additional  knowledge  about  the  characteristics  of  the  objects  and  the  sensors  is  useful  if  it  can  be 
incorporated  into  the  association  algorithm.  If  accurate  prior  information  about  the  objects  is  available, 
the  information  can  be  incorporated  in  the  prior  probability  functions  used  in  the  derivation.  The 
derivation  has  to  be  redone  to  get  the  appropriate  functions  for  construction  of  the  association  matrix.  If 
the  detection  characteristics  of  the  sensors  are  known  and  can  be  embodied  in  the  detection  probability 
functions,  better  association  performance  can  be  achieved. 

If  the  intended  application  severely  violates  any  of  the  assumptions  used  to  derive  the  association 
matrix  equations,  then  the  derivation  must  be  redone  with  the  new  set  of  assumptions  in  order  to  derive  a 
workable  association  algorithm.  Whether  the  new  assumptions  lead  to  a  linear  programming  algorithm 
depends  on  the  new  assumptions.  A  new  assignment  algorithm  can  be  constructed  by  starting  with 
Equation  (1)  and  carrying  out  the  new  derivation  with  a  different  set  of  assumptions  and  prior 
information. 

Although  the  focus  of  this  report  has  been  on  the  derivation  of  track  association  algorithms  with 
Bayesian  probability  theory  and  linear  assignment  algorithms,  there  is  no  intention  to  imply  that  this 
approach  is  the  only  way  to  develop  track  association  algorithms.  It  is  certainly  possible  to  construct  track 
association algorithms with other theories, such as statistical decision theory [14] or Dempster-Shafer evidential
reasoning [15]. Little work has been done with track association algorithms in these areas at this time, and
most  track  association  algorithms  that  have  been  developed  to  date  can  be  more  easily  related  to  the  track 
association  algorithm  that  has  been  outlined  in  this  report.  Bayesian  probability  theory  provides  the  best 
guidance  to  the  construction  of  track  association  algorithms. 